Chapter 31

The Organization of Knowledge

Much of biology has traditionally been concerned with the classification of objects, especially, of course, organisms, the best-known example probably being Carl Linnaeus’ Systema Naturae, first published in 1735. As knowledge has continued to expand, the desire to classify has also spread to bioinformatics and its objects: genes and other DNA sequences, proteins, and other molecules. As the number of objects stored in databases has grown, some kind of systematization has been seen as essential to aid database searches. Unfortunately, classification almost inevitably results in distortion, and the more rigid the classification, the more severe the distortion. Linnaeus himself considered that his classification was to some extent artificial.

The only admissible classifying arrangement of collections of objects is one that respects the principle of maximum entropy: the arrangement that imposes the fewest assumptions upon the data should be selected.¹ Here, these issues can only be very briefly discussed; the main purpose is to alert the reader to the dangers of classification and to encourage a cautious approach to its adoption. Sommerhoff (1950) has pointed out that “Biologists have been too keen to explain things before they were able to state in exact terms what they wanted to explain,” and aptly mentions Quine’s remark “that the less a science is advanced, the more its terminology tends to rest on the uncritical assumption of mutual understanding”. Ontologies (in the specific sense of Footnote 4) are an obvious attempt to achieve mutual understanding, but at the price of an overly rigid structure that, given the very incomplete state of our knowledge in the field, will surely tend to hinder its further development. Just as the formation of bone requires both osteoblasts and osteoclasts, so does the growth

¹ A particularly glaring example of disrespect toward this principle is to be found in the current fashion among museum curators to ceaselessly rearrange their collections in order to demonstrate some preconceived idea or another, whereas, ideally, the exhibits should be displayed in an unstructured manner, in order to allow the thoughtful visitor to draw his or her own conclusions from the raw evidence. Only in that way can new knowledge (conditional information) be generated through the perception of new, hitherto unperceived, relationships.

© Springer Nature Switzerland AG 2023

J. Ramsden, Bioinformatics, Computational Biology,

https://doi.org/10.1007/978-3-030-45607-8_31
